STANDARDIZATION OF THE CONSTRUCTION OF 
STATISTICAL TABLES 

By Edmund E. Day, Harvard University 



The progress of every art should be marked by the accumu- 
lation of an increasing stock of generally accepted practices. 
As these practices obtain common approval, they should be 
recognized as standard and regularly followed until more 
satisfactory methods are discovered. A measure of standardi- 
zation is thus a normal feature of development. 

Standardization nevertheless has its dangers. Like "law 
and order" in civil life, it may be overdone. There is always 
the risk of formalism. But kept within proper limits, 
standardization has a steadying influence which tends to 
accelerate, not retard, the advancement of the art. It effects 
good order, and is an unmistakable mark of real progress. 
It is consequently profitable to consider from time to time 
the extent to which standardization can advantageously be 
accepted. In statistical exposition, the standardization of 
graphic methods has been one of the gratifying advances of 
recent years. To what extent has there been and to what 
extent are there further opportunities for a similar standardi- 
zation of practice in the methods of tabular presentation? 

In considering this question, it should not be thought that 
standardization is accomplished only through the conscious 
adoption of rules and regulations set up by recognized organs 
of authority. Standardized statistical practices may evolve 
by imperceptible degrees through the influences of imitation 
and prestige. This is particularly the case if some one statisti- 
cal bureau is the fountain-head of governmental practice. 
The working rules of such an office tend to become the rules 
of a following of less influential practitioners. Standardiza- 
tion of this kind is going on at all times. Such standardiza- 
tion of practice as we have today in statistical work in this 
country is almost altogether the result of these influences of 
imitation and prestige. 

Unconscious standardization of this sort has already made 
substantial progress with regard to the structure of statistical 
tables. Without attempting a complete enumeration of the 
rules observed by competent authorities, a few of the standard 
practices may be noted in passing. It is generally recognized: 

(1) that every table should be self-sufficing, containing within 
itself a clear explanation of the meaning of the items displayed; 

(2) that every table should be logically a unit, containing 
only data which are intimately related with one another; 

(3) that column- and row-headings should be brief, unambiguous, 
and self-explanatory, table footnotes being used when necessary 
to make the headings perfectly clear; 

(4) that coordinate and subordinate relationships among the 
column- and row-headings should be shown by variations of boxing 
in the captions and of indentation in the stub; 

(5) that varieties of letters, figures, lines, column-widths, and 
interlinear spacings should be employed to facilitate easy and 
intelligent use of the table; 

(6) that columns and rows should be lettered or numbered if 
cross reference is desirable; and 

(7) that sources and units should invariably be indicated. 

The common acceptance of these principles represents no mean 
advance in the standardization of statistical table structure. 

It is to be observed, however, that the standardization thus 
far effected concerns primarily the constituent parts of the 
table, not the table's general form. The choice of position 
between columns and rows, the arrangement of the several 
columns or the several rows, and the location of particular 
columns toward the left of the table or of particular rows 
toward the top, seem still to be matters of individual prefer- 
ence, if not of chance. It is important to consider how far 
standardization of the general form of statistical tables is 
feasible and desirable. 

Standardization of the general form of statistical tables 
must begin with a distinction between general-purpose and 
special-purpose tables. The general-purpose table is designed 
to bring together in most convenient and accessible form all 
the data bearing upon a given topic. The special-purpose 
table is intended to throw into relief relationships of special 
significance in a given study. The general-purpose table is 
an orderly presentation of statistical material; the special- 
purpose table, a record of the results of statistical analysis. 
Of course, a measure of analysis is a prerequisite even of the 
general-purpose table, but the analysis is of a different order. 
It is the analysis essential to effective enumeration and tabu- 
lation, not the analysis accompanying specific interpretation. 
The analysis required for the special-purpose table is directed 
toward a particular issue. The problems of good table 
structure are essentially different for the two types of tables. 

Since the construction of the general-purpose table is the 
simpler case, it will be examined first. In considerable meas- 
ure, the general-purpose, or primary, table is a creature of 
the physical form of the medium in which it appears. Upon 
the one hand, the table tends to expand to accommodate the 
large body of data pressing for inclusion. Upon the other 
hand, the capacity of the printed page — even if it be folio — 
stands as a limit on the indefinite enlargement of the table. 
Tables which are allowed to exceed the dimensions of the page 
and have to be folded in are everywhere recognized as objec- 
tionable. Loose tables, separately printed in large irregular 
sizes, are as bad, if not worse. Tables running across two 
pages facing one another are reasonably satisfactory, but are 
to be avoided where possible. Tables which are presented 
at right angles to the text fall into the same class. In general, 
the single page, held as when reading the text, is the maximum 
size to which the statistical table should be permitted to run. 
Primary tables usually press upon this physical limit; their 
outside dimensions are thus independently determined. 

Within the table, similar influences are at work. Whether 
given arrays of data shall be exhibited in columns or in rows 
is commonly a question of the difference in the vertical and 
horizontal capacity of the page. The maximum number of 
lines in a table is several times greater than the maximum 
number of columns. Consequently the arrays having the 
greatest number of items are naturally assigned to the columns, 
the other arrays to the rows. Once a given set of headings 
has appeared in caption- or stub-position, there is a strong 
presumption in favor of its occupying the same position in 
other related tables, for the transcription of data from general 
tables is thereby facilitated. Upon the whole, however, the 
assignment of columns and rows rests fundamentally upon 
the greater capacity of the column — a factor not subject to 
modification by the statistician. 
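
The rule of assignment just described can be put as a minimal 
sketch. The function name assign_positions and the sample headings 
are illustrative assumptions, not part of the original discussion; 
the sketch only encodes the point that the classification with the 
greater number of headings is listed down the stub. 

```python
def assign_positions(class_a, class_b):
    """Return (stub, caption) for two classifications of headings.

    Hypothetical helper: a page holds several times more lines than
    columns, so the longer classification is listed down the stub.
    """
    if len(class_a) >= len(class_b):
        return class_a, class_b
    return class_b, class_a


# Illustrative headings only.
states = ["Maine", "New Hampshire", "Vermont",
          "Massachusetts", "Rhode Island", "Connecticut"]
years = [1920, 1910]

stub, caption = assign_positions(years, states)
print("stub (row-headings):      ", stub)
print("caption (column-headings):", caption)
```

The sketch ignores the further presumption, noted above, that a 
set of headings should keep the position it held in related tables. 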

A much larger measure of option may be exercised in fixing, 
in a general-purpose table, the order of columns and of rows. 
Almost any systematic plan may be adopted; but the most 
satisfactory arrangements are the alphabetical, the chronological, 
the geographical, and that according to the magnitude of the items. 
There are no grounds for urging the adoption of any one or 
two of these arrangements to the exclusion of the others. Now 
one best serves; now another. One rule, however, should 
govern the final selection in all cases: that order should be 
employed which keeps the details of the table most generally 
accessible. Readers will come to the table with a variety of 
interests. They should be given that table from which in 
general they can most easily draw the information they seek. 
Arrangement according to magnitude or importance of items 
is less satisfactory in general-purpose than in special-purpose 
tables, because it depends upon analysis from a single point of 
view and it is frequently unwise to commit the table to this 
particular viewpoint. The other arrangements better meet 
the variety of needs which a primary table is designed to serve. 
The important end is to secure some logical and commonly 
understood arrangement which opens the table to easy 
transcription. 
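
Three of these arrangements may be sketched as alternative sort 
orders over the same rows. The function order_rows, the field names 
"name" and "value", and the figures are hypothetical; the 
geographical order is supplied by the user as an agreed list of 
names, since no such order can be computed from the data. 
Chronological order is omitted here only to keep the rows simple. 

```python
def order_rows(rows, plan, geographic_order=None):
    """Return rows sorted under one of several systematic plans.

    rows: list of dicts with hypothetical keys "name" and "value".
    geographic_order: agreed list of names, required for "geographical".
    """
    if plan == "alphabetical":
        return sorted(rows, key=lambda r: r["name"])
    if plan == "geographical":
        if geographic_order is None:
            raise ValueError("a geographical plan needs an agreed order of names")
        rank = {name: i for i, name in enumerate(geographic_order)}
        return sorted(rows, key=lambda r: rank[r["name"]])
    if plan == "magnitude":
        # Largest item first.
        return sorted(rows, key=lambda r: r["value"], reverse=True)
    raise ValueError(f"unknown plan: {plan}")


# Invented figures, for illustration only.
rows = [
    {"name": "Vermont", "value": 30},
    {"name": "Connecticut", "value": 40},
    {"name": "Maine", "value": 10},
    {"name": "New Hampshire", "value": 20},
]
north_to_south = ["Maine", "New Hampshire", "Vermont", "Connecticut"]

for plan in ("alphabetical", "geographical", "magnitude"):
    ordered = order_rows(rows, plan, geographic_order=north_to_south)
    print(plan, [r["name"] for r in ordered])
```

The magnitude plan commits the table to a single point of view, 
which is why the text prefers the other plans for primary tables. 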

When geographical or chronological orders are adopted, a 
decision has to be reached as to what items to place at the 
top and left, and what items at the bottom and right. In 
the tabular arrangement of the states of this country, the 
grouping and order followed by the Bureau of the Census may 
be recognized as standard; the northern New England states 
stand at the head of the list, the southern Pacific states at the 
foot. In general, the best statistical practice for this country 
would seem to be to run geographical series from north to 
south and from east to west. With chronological series the 
case is not so clear. Upon the whole, however, for general- 
purpose tables, the Census Bureau practice of placing most 
recent dates at the top and left seems commendable if there 
is a fair presumption that the figures of most recent date will 
be most frequently transcribed. When, however, the data 
probably will be transcribed in entirety as time series, it would 
seem preferable to place the figures for earlier dates toward 
the top and left. The rule to apply in all these cases is simple: 
the most generally useful data should be located toward the 
top and left where accurate transcription is rendered easier 
by close proximity to the column- and row-headings. 
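
For chronological series in a general-purpose table, the rule 
reduces to a single choice of direction. The sketch below assumes 
a hypothetical function order_years and a flag stating whether 
the series is likely to be transcribed in its entirety; neither 
name comes from the original. 

```python
def order_years(years, transcribed_as_whole_series):
    """Order dates for a general-purpose table.

    Whole series copied out: earliest date first.
    Otherwise: most recent date first, nearest the headings.
    """
    return sorted(years, reverse=not transcribed_as_whole_series)


print(order_years([1900, 1920, 1910], transcribed_as_whole_series=True))
print(order_years([1900, 1920, 1910], transcribed_as_whole_series=False))
```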

The general or primary table exhibits no specific analysis. 
Its form is in considerable measure the resultant of the physi- 
cal limitations of the page and the necessity of presenting a 
maximum body of data in a way that will make the most 
generally useful parts most readily accessible. The derived or 
analytical table, which is the special-purpose table already 
distinguished, is a different statistical device. A derived 
table is essentially deficient if it fails to exhibit a carefully 
formulated analysis. It should be constructed to assist a 
specific interpretation. Every effort should be made to make 
the table simple. It should contain only those items valuable 
to the analysis, arranged so as to encourage the deductions 
the reader is expected to draw. If any line is to be traced 
between statistical tabulation and statistical analysis, the 
primary table displays the results of tabulation, the derived 
table the results of analysis. 

Despite this fundamental distinction between primary and 
derived tables, it is to be admitted in the first place that the de- 
rived table is not altogether free from the influences of format 
which play so important a part in shaping the primary table. 
For example, if the number of subdivisions in one classifica- 
tion of an analysis is much greater than in the other, it may 
be necessary to put the more extended classification in the 
stub merely because stub-capacity is normally so much greater 
than caption-capacity. Similarly, if the designations in one 
classification are much longer than in the other, it may be 
necessary to place the classification with longer headings in 
the stub, since neither of the alternatives — printing the longer 
headings vertically at the top of the columns, or widening the 
columns to accommodate the longer headings horizontally — 
is at all satisfactory. Such crass considerations as these are 
at times decisive in determining the structure even of the 
derived table. But they play a much less important part with 
the derived table than with the primary table. As a rule, the 
statistician is able to make the general form of the derived 
table serve the exposition in hand. 

One of the most fundamental questions of structure is the 
assignment of data to columns in some instances, to rows in 
others. This matter should be settled in the derived table 
with reference to what comparisons it is most important to 
present. Comparison of like items in a column is much easier 
than of like items in a row. It is believed that recognition of 
this fact will commonly throw chronological, geographical, 
and quantitative classifications into the stub, and qualitative 
classifications into the caption; but this is not a necessary 
outcome. The important principle is to use the column 
position to promote the more significant comparison. 

Arrangement of the several columns and of the several rows 
in the derived table will be determined by the particular 
character of the analysis in connection with which the table 
is employed. If the analysis is of a temporal distribution, 
a chronological order will be adopted; if of a spatial distribu- 
tion, a geographical order. If the items are component parts 
of an aggregate, arrangement will be either according to the 
relative magnitude or importance of the item, or according 
to some other order generally recognized in the analysis of the 
data in question. Presumably the alphabetical arrangement 
will seldom be followed, since it does not directly disclose 
significant relationships. Ordinarily the purpose of the 
analysis will indicate clearly enough the order in which the 
columns or the rows should be placed. 

Naturally, the arrangement of columns and rows should 
give proper regard to the fact that the most conspicuous posi- 
tion in a statistical table is at the top and left. While it is 
generally true that derived tables are designed to bring out 
relationships rather than individual items and that these 
relationships are properties of the table as a whole rather than 
of particular parts, it may be desirable in some tables to focus 
attention especially upon certain more important items. 
When other considerations will permit, these more important 
items should be placed in the most exposed positions of the 
table: namely, at the top and left next to the captions and 
stub. This rule is a sufficient warrant for placing totals at 
the top and left when they are clearly the most significant 
items of the tabulation, and when placing them at the top and 
left will not give serious offense to the users of the table. If 
either of these conditions is not present, it would seem prefer- 
able to place totals in the positions in which most readers 
expect to find them, namely, at the bottom and right. There 
appears to be no adequate reason for departing from the estab- 
lished practice of reading time from top to bottom and left to 
right. In derived tables, figures for later dates should appear 
toward the bottom and right. It is the relation between 
items, not the individual item, which is significant in time 
series. For many reasons we are accustomed to think of 
the upper or left-hand of two figures as the earlier, and 
we draw our conclusions accordingly. Furthermore, this rule 
is already thoroughly incorporated in our graphic practices. 
To have diametrically different rules for graphic and tabular 
presentation would be unfortunate. The Census Bureau 
practice of placing data for most recent dates at the top and 
left is therefore not to be approved for the derived table. 
Effective exposition of the statistical evidences is better served 
by the order which seems most natural to the great majority 
of readers. Arrangements of columns and rows should hold 
fast to the purpose of facilitating interpretation. 

If the dominant purpose of the derived table be kept in 
mind, many problems of tabular arrangement will be readily 
solved. Percentage distributions will be placed next to the 
corresponding absolute figures or in a separate portion of the 
table, according to the emphasis of the analysis. To facilitate 
comparisons of relationship, the arrangements adopted in one 
table of an analysis will be followed as closely in the other 
tables as other more important considerations will permit. 
Columns and rows which are to be compared with one another 
will be brought as closely together as possible. Unnecessary 
digits will be dropped and items given in round numbers to 
simplify the presentation. The aim throughout will be to 
make the derived table an effective instrument of statistical 
exposition. 
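
Two of these practices, the placing of percentage distributions 
beside the absolute figures and the dropping of unnecessary 
digits, can be shown in a brief sketch. The function 
with_percentages and the figures are invented for illustration; 
the total is placed at the bottom, where most readers expect it. 

```python
def with_percentages(items, digits=1):
    """Pair each absolute figure with its share of the total.

    items: list of (label, count) pairs.
    Returns rows of (label, count, percent), with a total row last.
    """
    total = sum(count for _, count in items)
    rows = [
        (label, count, round(100 * count / total, digits))
        for label, count in items
    ]
    # The total row states 100 per cent outright; rounded shares
    # may not sum to exactly 100.
    rows.append(("Total", total, 100.0))
    return rows


# Invented figures, for illustration only.
for label, count, pct in with_percentages([("A", 50), ("B", 30), ("C", 20)]):
    print(f"{label:<8}{count:>8}{pct:>8.1f}")
```

Whether the percentage column stands beside the absolute figures, 
as here, or in a separate portion of the table depends on the 
emphasis of the analysis. 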

If such are the considerations involved in the construction 
of statistical tables, what conclusions are to be drawn regard- 
ing the possibilities of standardization of table structure? 
Upon the whole, the opportunities for complete standardiza- 
tion seem slight except with regard to the elements from which 
the table is to be constructed, and certain lesser matters of 
general arrangement. More is to be gained at this time from 
a clear recognition of important guiding principles in table 
construction. Careful attention must be paid to the difference 
of purpose in primary and derived tables. The primary 
table must be made to offer its items for easy transcription; 
the derived table, for ready deduction. If statistical tables 
are formed with nice regard for these fundamental aims of 
tabular presentation, standardization may well be allowed to 
proceed as it has heretofore, through imitation of the most 
satisfactory existing practices. Untiring experiment with 
varying forms and ready acceptance of improvements are for 
the present the most promising means of securing better 
construction of statistical tables. 



